Abstract: This paper examines the new problems faced by multi-type data fusion, analyzes from the standpoint of correlation which 
relationships in multi-type data can be exploited for fusion, puts forward the concept of data type attribute correlation, and 
summarizes the research status of correlation-based data fusion. The general process of multi-type data fusion research is given, and 
the relevant literature is summarized according to that process. The model combines the hierarchical structure of wireless sensor 
networks with neural networks and designs each cluster as a three-layer perceptron neural network. The neural network extracts 
feature data from the large amount of raw data collected, and the feature data is then sent to the aggregation node. 
1. INTRODUCTION 


As a new kind of network system, the wireless sensor network 
(WSN) is low in cost, high in precision and easy to operate, 
and it is becoming increasingly prominent in science and 
technology. In a wireless sensor network, a large number of 
sensor nodes continuously collect field data at a set period, 
and these data have the following characteristics: (1) the 
data collected repeatedly by a single sensor node within a 
short time are highly similar; (2) the data collected by 
neighboring sensor nodes at the same time are highly similar. 
Data fusion is one of the important research fields in 
wireless sensor networks. 


Data fusion technology can effectively ease the energy 
limitation of wireless sensor networks. By merging data from 
multiple data sources and removing redundant information, 
data fusion reduces the amount of data transmitted in the 
network. In WSN applications, each node collects a large 
amount of sensing data. Suppose that a node collects the 
humidity of the surrounding soil every 30 s: it generates 120 
readings every hour and 2880 readings every day. The amount 
of data collected by the nodes is therefore very large. 
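
The sampling arithmetic above can be checked with a short Python sketch. The 30 s period comes from the text; the helper names and the network size in the last line (100 nodes, 3 monitored quantities) are hypothetical values chosen only for illustration.

```python
# Readings produced by one node that samples at a fixed period.
SAMPLE_PERIOD_S = 30  # sampling period from the text: one reading every 30 s


def readings_per(duration_s: int, period_s: int = SAMPLE_PERIOD_S) -> int:
    """Number of readings generated in duration_s seconds."""
    return duration_s // period_s


per_hour = readings_per(3600)       # readings in one hour
per_day = readings_per(24 * 3600)   # readings in one day


def network_readings_per_day(n_nodes: int, n_types: int) -> int:
    """Hypothetical extension: n_nodes nodes, each monitoring n_types quantities."""
    return n_nodes * n_types * per_day


print(per_hour, per_day, network_readings_per_day(100, 3))
```

Even a modest network therefore produces hundreds of thousands of readings per day, which is the motivation for fusing data before transmission.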


If multi-dimensional monitoring data is considered, the sharp 
increase in the amount of data transmitted by nodes will 
deplete node energy. The sensing data of neighboring nodes are 
highly similar, or even identical, in the physical type 
attributes they express, so the multi-source data exhibit a 
certain degree of spatial correlation in physical space. At the 
same time, because physical phenomena are continuous and query 
operations in wireless sensor networks are limited, the data 
values obtained by a node through continuous sampling are also 
similar to one another, which is called temporal correlation. 
Research on data type attribute correlation is another 
important direction, following the research on space-time 
correlation. 
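
One simple way to quantify the spatial and temporal correlation described above is the Pearson correlation coefficient. The sketch below is an illustration only: the paper does not specify a correlation measure, and the humidity readings are made-up values for two hypothetical neighboring nodes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical soil-humidity readings (%) taken at the same instants
# by two neighboring nodes.
node_a = [41.0, 41.2, 41.5, 41.9, 42.4, 42.8]
node_b = [40.8, 41.1, 41.3, 41.8, 42.2, 42.7]

# Spatial correlation: two nodes compared at the same sampling times.
spatial = pearson(node_a, node_b)

# Temporal correlation: one node compared with itself one sample later.
temporal = pearson(node_a[:-1], node_a[1:])

print(round(spatial, 3), round(temporal, 3))
```

Values close to 1 for both quantities indicate the kind of redundancy that data fusion can remove.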


The division of sensor nodes by hop count in the network is shown in Figure 1. 


Data fusion effectively reduces the number of packets and thus 
network energy consumption. In its earliest form, it is a 
processing technology applied at the source-coding stage. The 
NNBA model is designed for real-time monitoring applications, 
such as real-time forest fire monitoring networks and large 
greenhouse monitoring networks. In this kind of application 
scenario, the sensor nodes continuously collect environmental 
indicators such as temperature, humidity and light intensity, 
and transmit the collected data to the sink node. 
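
The idea of treating a cluster as a three-layer perceptron can be sketched as a plain forward pass. This is a minimal illustration under stated assumptions, not the NNBA implementation: the layer sizes, the sigmoid activation, the random untrained weights and the function names are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def cluster_forward(readings, w_hidden, b_hidden, w_out, b_out):
    """Input layer (raw readings) -> hidden layer -> output layer (features)."""
    hidden = layer(readings, w_hidden, b_hidden)
    return layer(hidden, w_out, b_out)


# Hypothetical sizes: 6 raw readings are reduced to 2 feature values.
N_IN, N_HIDDEN, N_OUT = 6, 3, 2
random.seed(0)  # untrained weights, fixed only for repeatability
w_hidden = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
w_out = [[random.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b_hidden = [0.0] * N_HIDDEN
b_out = [0.0] * N_OUT

readings = [0.41, 0.42, 0.40, 0.43, 0.41, 0.42]  # made-up normalized readings
features = cluster_forward(readings, w_hidden, b_hidden, w_out, b_out)
print(len(readings), "readings ->", len(features), "features")
```

The point of the sketch is the data reduction: only the short feature vector, rather than every raw reading, would need to be forwarded toward the sink node.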


Figure 1. Division of sensor nodes by hop count in the network 
(image from Google) 


2. THE PROPOSED METHODOLOGY 


2.1 Correlation characteristics of multi-type data 


First of all, data type attribute correlation is different from 
space-time correlation; it is a new kind of correlation that is 
independent of space-time correlation. The correlation of 
monitoring data type attributes depends on the inherent 
attributes of the monitored target's own type: data type 
attributes may depend on one another, be convertible into one 
another, or be equivalent to one another. Krishnamachari, 
Estrin and other researchers have studied in depth the impact 
of data fusion technology on wireless sensor networks. Their 
results show that the impact of data fusion on the system is 
mainly manifested in two aspects: reduced energy consumption 
and increased delay. 
In fact, these two effects are in irreconcilable conflict: 
data fusion saves energy but inevitably increases the delay of 
data transmission. 


The data funnel is essentially cluster-based data fusion. The 
boundary node is equivalent to the cluster head node, and the 
sensor nodes are the intra-cluster nodes. The cluster head 
node is responsible for merging the data packets of the nodes 
in its cluster, and an encoding algorithm based on data order 
can further compress the size of the data packets. 


For formula (13), the same Gaussian kernel function is 
selected, and the fitted curve of the sensing data is shown in 
Figure 2. When formula (10) is used to approximate the data, 
large deviations may occur at the boundary or at individual 
positions, because there may be no data associated with those 
positions when the kernel method computes the weight values. 
The local polynomial kernel regression method, by contrast, 
fits the boundary region well. The study of data type 
attribute correlation is another important direction, 
following the study of space-time correlation. 
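
The boundary behavior of kernel fitting mentioned above can be illustrated with a small Python sketch. Formulas (10) and (13) are not reproduced in this section, so the code assumes the two standard estimators: the Nadaraya-Watson kernel average and a local linear (first-order local polynomial) fit, both with a Gaussian kernel. The bandwidth and the synthetic data are hypothetical.

```python
import math


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of the observations around x0."""
    weights = [gaussian_kernel((x - x0) / h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear(x0, xs, ys, h):
    """Weighted least-squares fit of y ~ a + b * (x - x0); returns a."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = gaussian_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    # Solve the 2x2 weighted normal equations for the intercept a.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Synthetic noise-free data on [0, 1] with a known trend y = 2x + 1.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
H = 0.2  # hypothetical bandwidth

# At the left boundary x0 = 0 the true value is 1.0.
nw_at_boundary = nadaraya_watson(0.0, xs, ys, H)
ll_at_boundary = local_linear(0.0, xs, ys, H)
print(nw_at_boundary, ll_at_boundary)
```

At the boundary every neighbor lies on one side of the query point, so the plain kernel average is pulled toward the interior values, while the local linear fit corrects for the slope and recovers the trend.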


2.2 Energy model of sensor nodes 

As noted in Section 2.1, data type attribute correlation is 
independent of space-time correlation and depends on the 
inherent attributes of the monitored target's own type. 
Traditional data fusion models, by contrast, are generally 
data-independent. Dedicated nodes are responsible for the 
fusion, which makes it passive data fusion, and the fusion is 
generally carried out during data transmission. This kind of 
data fusion does not consider the correlation of the data. 
After collecting the data, the node responsible for fusion 
sorts the collected information and merges or discards some 
data with low reliability, in order to reduce network energy 
consumption and improve data accuracy. 
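
The passive fusion step described above (discard low-reliability data, merge the rest) might look as follows. This is a sketch under assumptions: the `Reading` type, the reliability score in [0, 1], the threshold and the reliability-weighted mean are hypothetical choices, since the text does not say how reliability is measured or how the data are merged.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    node_id: int
    value: float
    reliability: float  # hypothetical score in [0, 1]


def fuse(readings, min_reliability=0.5):
    """Drop unreliable readings, then merge the rest into one value."""
    kept = [r for r in readings if r.reliability >= min_reliability]
    if not kept:
        return None  # nothing trustworthy to forward
    total = sum(r.reliability for r in kept)
    # Reliability-weighted mean of the remaining readings.
    return sum(r.value * r.reliability for r in kept) / total


collected = [
    Reading(node_id=1, value=20.0, reliability=1.0),
    Reading(node_id=2, value=22.0, reliability=1.0),
    Reading(node_id=3, value=90.0, reliability=0.1),  # likely faulty sensor
]
fused = fuse(collected)
print(fused)
```

Three packets are reduced to a single value, and the outlier from the unreliable node does not distort the result. Note that nothing here uses the correlation between readings, which is the limitation the text points out.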


Neural networks and data fusion share a basic feature: both 
apply operations to a large amount of data to obtain 
conclusive results that reflect the characteristics of the 
data. Neural networks can therefore be used to implement data 
fusion. 


Energy is an important resource in wireless sensor networks, 
and the main role of data fusion is to save energy. It is 
therefore necessary to establish an energy model of the sensor 
nodes and to quantify the impact of data fusion on node energy 
and on the lifetime of the wireless sensor network. 
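
As an illustration of how such an energy model can quantify the benefit of fusion, the sketch below uses the first-order radio model that is common in the WSN literature. This is an assumption: the section does not state which energy model is intended, and all constants, packet sizes and distances are illustrative values.

```python
# Illustrative constants for a first-order radio model (assumed, not from the paper).
E_ELEC = 50e-9     # J/bit spent by transmitter or receiver electronics
EPS_AMP = 100e-12  # J/bit/m^2 spent by the transmit amplifier
E_FUSE = 5e-9      # J/bit spent fusing received data


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m` metres."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2


def rx_energy(bits):
    """Energy to receive `bits`."""
    return E_ELEC * bits


def head_energy(n_members, bits, distance_to_sink_m, fuse):
    """Energy a cluster head spends in one round."""
    receive = n_members * rx_energy(bits)
    if fuse:
        # Fuse all member packets into one packet, then send it once.
        fusion = n_members * bits * E_FUSE
        return receive + fusion + tx_energy(bits, distance_to_sink_m)
    # Without fusion every member packet is relayed separately.
    return receive + n_members * tx_energy(bits, distance_to_sink_m)


without_fusion = head_energy(10, 2000, 100.0, fuse=False)
with_fusion = head_energy(10, 2000, 100.0, fuse=True)
print(without_fusion, with_fusion)
```

Under these assumed values the cluster head spends several times less energy per round when it fuses before forwarding, because long-range transmission dominates the cost.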


By analyzing the correlation between the sampling points and 
the observed data, the sensing data can be well approximated by 
linear regression. In practical problems, however, the 
relationship between the data is often not linear, and direct 
use of a linear regression model will fail. In that case a 
reasonably close curve is selected to fit the data according 
to prior knowledge, the nonlinear equation is linearized 
through a transformation, and the least squares method is then 
used to solve the resulting linear regression equation. 
Data-independent fusion algorithms generally include two types 
of compression techniques, one of which is applied in source 
coding. 
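
The linearize-then-fit procedure described above can be sketched for one common case. The exponential curve y = a * exp(b * x) is an assumed example of a curve chosen from prior knowledge; taking logarithms gives ln y = ln a + b * x, which is linear in x and can be solved by ordinary least squares. The function names and data are hypothetical.

```python
import math


def linear_least_squares(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by regressing ln(y) on x (requires y > 0)."""
    slope, intercept = linear_least_squares(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope  # a, b


# Synthetic noise-free data generated with a = 2.0 and b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(a, b)
```

With noisy data the log transform also changes the error weighting, so the recovered parameters are least-squares optimal in the transformed space rather than in the original one.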
3. CONCLUSION 


This paper discusses the problems faced by current data 
fusion research, such as the complexity of data correlation 
and redundancy and the difficulty of data fusion. The research 
status of data fusion is summarized, and the concept of data 
type attribute correlation is proposed; this correlation is 
independent of space-time correlation yet has certain 
space-time characteristics. The paper points out that data 
prediction is an effective approach to multi-type data fusion, 
data packet merging and model-driven data fusion. In practical 
applications, data fusion needs to be combined with the MAC 
protocol, data-centric routing, network topology and other 
factors in a cross-layer design and optimization, so as to 
obtain the best energy savings. 